An Approach to Improve the Smoothing Process Based on Non-uniform Redistribution

Authors

  • Feng-Long Huang
  • Ming-Shing Yu
Abstract

This paper proposes an effective technique for improving smoothing methods in language models, based on a non-uniform redistribution of probability to novel (previously unseen) events. Smoothing methods consist of two processes: 1) discounting and 2) redistributing. Most smoothing methods assign a uniform probability to each unseen event; we instead propose a new technique that improves the redistribution process. By referring to the probabilistic behavior of all seen events, our method redistributes probability to novel events non-uniformly. The proposed technique is applied to the well-known and frequently used Good-Turing smoothing method. Empirical results are presented and analyzed for two n-gram models. The improvement is apparent, especially at higher unseen-event rates.
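
The two processes named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not give their redistribution formula, so the `weights` argument below is a hypothetical stand-in for whatever "probabilistic behavior of seen events" is used. The sketch also simplifies discounting by rescaling all seen events by the Good-Turing missing-mass estimate N1/N, rather than applying per-count Good-Turing discounts. The names `missing_mass` and `smooth` are illustrative only.

```python
def missing_mass(counts):
    """Good-Turing estimate of the total probability of unseen events: N1 / N."""
    total = sum(counts.values())
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / total


def smooth(counts, unseen, weights=None):
    """Return probabilities for seen and unseen events.

    Discounting (simplified assumption): seen events are rescaled so that
    they sum to 1 - p0, where p0 is the Good-Turing missing mass.
    Redistributing: p0 is shared among the unseen events, uniformly when
    `weights` is None, otherwise in proportion to the given weights
    (a hypothetical stand-in for the paper's non-uniform scheme).
    """
    total = sum(counts.values())
    p0 = missing_mass(counts)
    probs = {event: (1 - p0) * c / total for event, c in counts.items()}
    if weights is None:
        weights = {event: 1.0 for event in unseen}
    norm = sum(weights[event] for event in unseen)
    for event in unseen:
        probs[event] = p0 * weights[event] / norm
    return probs


# Toy counts: 8 observations, 3 singletons, so the missing mass is 3/8.
toy_counts = {"a": 3, "b": 2, "c": 1, "d": 1, "e": 1}
uniform = smooth(toy_counts, ["x", "y"])
non_uniform = smooth(toy_counts, ["x", "y"], weights={"x": 3.0, "y": 1.0})
```

In both calls the seen events receive the same discounted probabilities; only the split of the reserved mass between the unseen events `x` and `y` differs, which is the part of the smoothing process the abstract targets.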

Similar resources

A new adaptive exponential smoothing method for non-stationary time series with level shifts

Simple exponential smoothing (SES) methods are the most commonly used methods in forecasting and time series analysis. However, they are generally insensitive to non-stationary structural events such as level shifts, ramp shifts, and spikes or impulses. Like outliers in stationary time series, these non-stationary events will lead to an increased level of errors in the forecasting pr...

Presentation and Solving Non-Linear Quad-Level Programming Problem Utilizing a Heuristic Approach Based on Taylor Theorem

Multi-level programming problems attract many researchers because of their applications in areas such as economics, traffic, finance, management, transportation, information technology, and engineering. It has been proven that even the general bi-level programming problem is NP-hard, so the multi-level problems are practical and complicated problems therefo...

SEISMIC ENERGY DEMANDS OF INELASTIC BUILDINGS DESIGNED WITH OPTIMUM DISPLACEMENT-BASED APPROACH

In the present study, the effects of optimization on seismic energy spectra, including input energy, damping energy, and yielding hysteretic energy, are parametrically discussed. To this end, 12 generic steel moment-resisting frames with fundamental periods ranging from 0.3 to 3 s are optimized using uniform damage and deformation approaches, subjected to a series of 40 non-pulse strong ground motio...

Analytical predictions for the buckling of a nanoplate subjected to non-uniform compression based on the four-variable plate theory

In the present study, the buckling analysis of the rectangular nanoplate under biaxial non-uniform compression using the modified couple stress continuum theory with various boundary conditions has been considered. The simplified first order shear deformation theory (S-FSDT) has been employed and the governing differential equations have been obtained using the Hamilton’s principle. An analytic...

External and Internal Incompressible Viscous Flows Computation using Taylor Series Expansion and Least Square based Lattice Boltzmann Method

The lattice Boltzmann method (LBM) has recently become an alternative and promising computational fluid dynamics approach for simulating complex fluid flows. Despite its enormous success in many practical applications, the standard LBM is restricted to the lattice uniformity in the physical space. This is the main drawback of the standard LBM for flow problems with complex geometry. Several app...

Publication date: 2005